Clustering 1 - dimensional continuous data
نویسنده
چکیده
How to derive meaningful information from continuous data is the central topic of this work. When age data about 10 000 or 100 000 people is given, how can we represent the data in a way that is understandable to the analyst and also reveals signi cant metainformation about the people included in the datasets? In this work I am reviewing di erent clustering algorithms and proposing the most suitable for inducing previously unknown information.
منابع مشابه
High-Dimensional Unsupervised Active Learning Method
In this work, a hierarchical ensemble of projected clustering algorithm for high-dimensional data is proposed. The basic concept of the algorithm is based on the active learning method (ALM) which is a fuzzy learning scheme, inspired by some behavioral features of human brain functionality. High-dimensional unsupervised active learning method (HUALM) is a clustering algorithm which blurs the da...
متن کاملMulticlass Spectral Clustering
We propose a principled account on multiclass spectral clustering. Given a discrete clustering formulation, we first solve a relaxed continuous optimization problem by eigendecomposition. We clarify the role of eigenvectors as a generator of all optimal solutions through orthonormal transforms. We then solve an optimal discretization problem, which seeks a discrete solution closest to the conti...
متن کاملDeep Continuous Clustering
Clustering high-dimensional datasets is hard because interpoint distances become less informative in high-dimensional spaces. We present a clustering algorithm that performs nonlinear dimensionality reduction and clustering jointly. The data is embedded into a lower-dimensional space by a deep autoencoder. The autoencoder is optimized as part of the clustering process. The resulting network pro...
متن کاملEfficient Information Theoretic Clustering on Discrete Lattices
We consider the problem of clustering data that reside on discrete, low dimensional lattices. Canonical examples for this setting are found in image segmentation and key point extraction. Our solution is based on a recent approach to information theoretic clustering where clusters result from an iterative procedure that minimizes a divergence measure. We replace costly processing steps in the o...
متن کاملFast and Robust Subspace Clustering Using Random Projections
Over the past several decades, subspace clustering has been receiving increasing interest and continuous progress. However, due to the lack of scalability and/or robustness, existing methods still have difficulty in dealing with the data that possesses simultaneously three characteristics: high-dimensional, massive and grossly corrupted. To tackle the scalability and robustness issues simultane...
متن کاملذخیره در منابع من
با ذخیره ی این منبع در منابع من، دسترسی به آن را برای استفاده های بعدی آسان تر کنید
عنوان ژورنال:
دوره شماره
صفحات -
تاریخ انتشار 2014